A Note on Average Mutual Information
for Spherically Invariant Processes


1. Introduction

The actual transmission capacity of a given channel is a parameter of
basic importance in any communication system, since it limits the rate at
which information can be transmitted reliably. There has thus been an
effort, starting with Shannon (1948), to compute the capacity of
transmission for different channel and transmission models. In the case of
a continuous channel, most results have been obtained for Gaussian noise
(Baker [1978]; Hitsuda and Ihara [1975]; Kadota, Zakai and Ziv [1971]). Some
attempts to steer away from the Gaussian case have also been made
(Gualtierotti [1980]), and these indicate that new methods may be required.
Indeed, in the Gaussian case, most quantities of interest can be obtained
explicitly, whereas these computations are almost always impossible in other
instances. Furthermore, the computation of mutual information requires that
the joint law of the message and the received signal be absolutely continuous
with respect to the product of the marginals and that the Radon-Nikodym
derivative be computed: though the Gaussian case is well known (Baker [1973]),
this knowledge is again unavailable for most other models.

Spherically invariant distributions are mixtures of Gaussian ones, and
through mixing a number of well known distributions can be obtained, such as
the double exponential and the Student distributions (Keilson and Steutel
[1974]). There is also evidence that some real life noises can be described
through spherically invariant probabilities, particularly in underwater
acoustics. It is thus natural to investigate the problems of absolute
continuity, of calculation of mutual information and of channel capacity for
spherically invariant noises. This is the subject of the present paper.

We  obtain  a  formula  for  average  mutual  information  when  the  mixing  measure 
is  discrete  with  finite  support. 


2.  Preliminaries 

We give here the basic definitions and a number of useful lemmas which
can be easily checked from first principles.

$H_1$ and $H_2$ are real and separable Hilbert spaces with respective inner
products $\langle u^1,v^1\rangle_1$, $u^1,v^1\in H_1$, and
$\langle u^2,v^2\rangle_2$, $u^2,v^2\in H_2$. $H$ is the set $H_1\times H_2$
and its elements are denoted $h=(u^1,u^2)$. If $h_1=(u^1,u^2)$ and
$h_2=(v^1,v^2)$, let $\langle\cdot,\cdot\rangle : H\times H\to\mathbb{R}$ be
the map defined by the relation

$$\langle h_1,h_2\rangle = \langle u^1,v^1\rangle_1 + \langle u^2,v^2\rangle_2.$$

$\langle\cdot,\cdot\rangle$ is an inner product on $H$ and, with this inner
product, $H$ is a real and separable Hilbert space. $p_{H_1}$ is the
projection with range $H_1\times\{0\}$, $p_{H_2}$ that with range
$\{0\}\times H_2$.


Lemma 1: Let $J_i : H\to H_i$ be defined by $J_i(u^1,u^2)=u^i$, $i=1,2$. Then

1) $J_1^*u^1=(u^1,0)$; $J_2^*u^2=(0,u^2)$.

2) $J_1^*J_1=p_{H_1}$; $J_2^*J_2=p_{H_2}$.

3) $J_1J_1^*=\mathrm{id}_{H_1}$; $J_2J_2^*=\mathrm{id}_{H_2}$.


$\mathcal{B}[H]$, $\mathcal{B}[H_1]$ and $\mathcal{B}[H_2]$ are the Borel sets
of, respectively, $H$, $H_1$ and $H_2$. Then

Lemma 2: $\mathcal{B}[H]=\mathcal{B}[H_1]\otimes\mathcal{B}[H_2]$.

If $P$ is a probability measure on $H$, $P^i=P\circ J_i^{-1}$, $i=1,2$, are
probabilities on $\mathcal{B}[H_i]$, $i=1,2$. They are called the marginals of
$P$. The product of these marginals is denoted either $P^{\otimes}$ or
$P^1\otimes P^2$ according to convenience. It is defined on $H$.

A Gaussian probability $P$ on any real and separable Hilbert space $H$ is
determined by its characteristic function $\varphi_P$ or its mean $m$ and
covariance $R$. $m$ belongs to $H$ and is identified by the relation

$$\langle h,m\rangle = \int_H \langle h,x\rangle\,P(dx).$$

$R$ belongs to the nonnegative and self-adjoint operators on $H$ which have
finite trace and is identified by the relation

$$\langle h,Rk\rangle = \int_H \langle h,x-m\rangle\langle k,x-m\rangle\,P(dx).$$

$m$, $R$ and $\varphi_P$ are related by
$\varphi_P(h)=\exp\{i\langle h,m\rangle-\tfrac{1}{2}\langle h,Rh\rangle\}$.

Lemma 3: Let again $H=H_1\times H_2$ and $P$ be a Gaussian probability on
$\mathcal{B}[H]$ with mean $m=(m^1,m^2)$ and covariance $R$. Then:

1) $P^i$ is Gaussian with mean $m^i$ and covariance $R_i=J_iRJ_i^*$, $i=1,2$.

2) $P^{\otimes}$ is Gaussian with mean $m$ and covariance
$R^{\otimes}=p_{H_1}Rp_{H_1}+p_{H_2}Rp_{H_2}$.
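As a finite-dimensional illustration of Lemma 3, 2), the sketch below builds the covariance of the product of the marginals by keeping the two diagonal blocks of $R$ and zeroing the cross blocks. It assumes $H=\mathbb{R}^{d_1}\times\mathbb{R}^{d_2}$ with $R$ given as a nested list; the name `product_covariance` and the sample matrix are hypothetical, chosen only for illustration.

```python
def product_covariance(R, d1):
    """Return p_{H1} R p_{H1} + p_{H2} R p_{H2} for a matrix R.

    The first d1 coordinates play the role of H_1, the remaining ones
    the role of H_2 (a finite-dimensional assumption made for this sketch).
    """
    n = len(R)
    result = []
    for i in range(n):
        row = []
        for j in range(n):
            # Keep an entry only when i and j lie in the same block.
            same_block = (i < d1) == (j < d1)
            row.append(R[i][j] if same_block else 0.0)
        result.append(row)
    return result


# Hypothetical covariance on R^1 x R^2.
R = [[2.0, 0.5, 0.3],
     [0.5, 1.0, 0.2],
     [0.3, 0.2, 1.5]]
R_prod = product_covariance(R, d1=1)
```

The cross-covariance terms, which carry all the dependence between the two components, are exactly what the product measure discards.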

If $H$ is any real and separable Hilbert space and $a>0$, $T_a : H\to H$
defined by $T_ah=ah$ is a homeomorphism of $H$, so that $P\circ T_a^{-1}$ is
well defined. This measure is written $P_a$ or $P(a,\cdot)$ according to
convenience.

Lemma 4: $P_a$ is Gaussian with mean $am$ and covariance $a^2R$.
$P(a,B)=P_a(B)$, $B\in\mathcal{B}[H]$, is a transition function defined on
$[0,\infty[\times\mathcal{B}[H]$.
Let $F$ be a probability measure on $\mathcal{B}[\mathbb{R}_+]$. A spherically
invariant measure on $H$ is a probability $Q$ of the form

$$Q(B)=\int_0^{\infty}P(a,B)\,F(da),\quad B\in\mathcal{B}[H].$$
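The definition can be read as a two-stage sampling recipe: draw a scale $a$ from $F$, then draw a point from $P_a$. The following is a minimal sketch under simplifying assumptions that are not in the paper: $F$ has finite support, $H$ is replaced by $\mathbb{R}^d$, and $P$ is centered with diagonal covariance. The names `sample_q`, `scales`, `probs` and `eigenvalues` are hypothetical.

```python
import math
import random


def sample_q(rng, scales, probs, eigenvalues):
    """Draw one point from Q = sum_i p_i P_{a_i}.

    Assumes P = N(0, diag(eigenvalues)) on R^d, so that P_a is the law
    of a * x with x distributed according to P.
    """
    # Stage 1: pick the scale a according to the mixing measure F.
    a = rng.choices(scales, weights=probs, k=1)[0]
    # Stage 2: draw from P and multiply by a (this is T_a applied to x).
    return [a * math.sqrt(lam) * rng.gauss(0.0, 1.0) for lam in eigenvalues]


rng = random.Random(0)
x = sample_q(rng, scales=[0.5, 2.0], probs=[0.3, 0.7],
             eigenvalues=[1.0, 0.25, 0.04])
```

When the support of $F$ contains $0$, the corresponding component is the point mass at the origin, which is the case treated separately in Proposition 2.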


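Lemma 5: Let again $H=H_1\times H_2$. Then:

1) $(P_a)^i=P^i\circ T_a^{-1}$, written $P^i_a$.

2) $P^1_a\otimes P^2_a=P^{\otimes}\circ T_a^{-1}$, written $P^{\otimes}_a$.

3) The covariance $R^{\otimes}_{a,b}$ of the measure $P^1_a\otimes P^2_b$,
written $P^{\otimes}_{a,b}$, is given by the formula

$$R^{\otimes}_{a,b}=a^2\,p_{H_1}Rp_{H_1}+b^2\,p_{H_2}Rp_{H_2}.$$

4) $Q^i(B)=\int_0^{\infty}P^i(a,B)\,F(da)$, $B\in\mathcal{B}[H_i]$.

5) $Q^{\otimes}(B)=\int_0^{\infty}\!\int_0^{\infty}P^{\otimes}_{a,b}(B)\,F\otimes F(da,db)$.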

O  O 

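Lemma 6: Let $P_1,\dots,P_m$ and $Q_1,\dots,Q_n$ be probability measures on
$(\Omega,\mathcal{A})$ such that $P_i\perp Q_j$, $1\le i\le m$, $1\le j\le n$.
Then there exists $A\in\mathcal{A}$ such that

$$P_i(A)=Q_j(A^c)=0,\quad 1\le i\le m,\ 1\le j\le n.$$

proof: For measures $\lambda$, $\mu$, $\nu$ such that $\lambda\perp\nu$ and
$\mu\perp\nu$, $\lambda+\mu\perp\nu$ (Ash, p. 67). By recursion, one obtains
that

$$\sum_{i=1}^{m}P_i\ \perp\ \sum_{j=1}^{n}Q_j.$$

$A$ is a set such that
$\bigl(\sum_{i=1}^{m}P_i\bigr)(A)=\bigl(\sum_{j=1}^{n}Q_j\bigr)(A^c)=0$.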
3. The case of a mixing measure F with finite support


A.  Absolute  continuity 

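Lemma 7: Suppose that neither $R_1$ nor $R_2$ is zero and that $b$ and $c$
are not both equal to $a$. Then

$$P^{\otimes}_{a}\perp P^{\otimes}_{b,c}.$$

proof: If $a=0$, or if $a>0$ and either $b$ or $c$ is zero, one has in the
first case $R^{\otimes}_a=0$, $R^{\otimes}_{b,c}\neq 0$ and, in the second,
$R^{\otimes}_a=a^2\{p_{H_1}Rp_{H_1}+p_{H_2}Rp_{H_2}\}$, while one of the two
diagonal terms of $R^{\otimes}_{b,c}$ vanishes. The square roots of
$R^{\otimes}_a$ and $R^{\otimes}_{b,c}$ do not then have the same range and,
since $P^{\otimes}_a$ and $P^{\otimes}_{b,c}$ are Gaussian, they must be
orthogonal (Rao-Varadarajan, 1963). If $a$, $b$ and $c$ are all positive and if
$P^{\otimes}_a$ and $P^{\otimes}_{b,c}$ are not orthogonal, they must be
equivalent since they are Gaussian (Rao-Varadarajan, 1963). But then, from the
same reference, we have that

$$R^{\otimes}_{b,c}=(R^{\otimes}_a)^{1/2}(I+T)(R^{\otimes}_a)^{1/2},$$

where $T$ is Hilbert-Schmidt with spectrum $\sigma(T)>-1$. However $T$ can be
identified as the "diagonal" operator with diagonal elements
$([b^2/a^2]-1)\,\mathrm{id}_{L_1}$ and $([c^2/a^2]-1)\,\mathrm{id}_{L_2}$,
leading to a contradiction.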

2 


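Lemma 8: Let $a$, $b$ and $c$ be positive. Then, if $P$ and $P^{\otimes}$ are
orthogonal, so are $P_a$ and $P^{\otimes}_{b,c}$.

proof: Since $P_a$ and $P^{\otimes}_{b,c}$ are Gaussian, if they are not
orthogonal, they must be equivalent (Rao-Varadarajan, 1963). But

$$P_a\sim N(0,a^2R)$$

and

$$P^{\otimes}_{b,c}\sim N(0,R^{\otimes}_{b,c}),$$

so that (Rao-Varadarajan, 1963),

$$R^{\otimes}_{b,c}=a^2R^{1/2}(I+T)R^{1/2},\quad T\ \text{Hilbert-Schmidt},\ \sigma(T)>-1. \tag{1}$$

By Lemma 5, 3),

$$p_{H_1}R^{\otimes}_{b,c}\,p_{H_2}=0$$

and, equivalently, using (1),

$$p_{H_1}Rp_{H_2}=-p_{H_1}R^{1/2}TR^{1/2}p_{H_2}. \tag{2}$$

Furthermore, by Lemma 3, 2),

$$R^{\otimes}-R=-\{p_{H_1}Rp_{H_2}+p_{H_2}Rp_{H_1}\},$$

so that, using (2), one gets

$$R^{\otimes}-R=p_{H_1}R^{1/2}TR^{1/2}p_{H_2}+p_{H_2}R^{1/2}TR^{1/2}p_{H_1}. \tag{3}$$

By Lemma 5, 3), and the assumption $b>0$,

$$\langle p_{H_1}Rp_{H_1}h,h\rangle\le b^{-2}\langle R^{\otimes}_{b,c}h,h\rangle$$

and, by (1),

$$\langle R^{\otimes}_{b,c}h,h\rangle\le a^2\,\|I+T\|\,\langle Rh,h\rangle.$$

Consequently, if $K$ is an appropriate constant,

$$\langle p_{H_1}Rp_{H_1}h,h\rangle\le K\langle Rh,h\rangle.$$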


Thus (Douglas, 1966)

$$p_{H_1}Rp_{H_1}=R^{1/2}UR^{1/2}, \tag{4}$$

where $U$ is bounded, nonnegative and self-adjoint.
Similarly, one has

$$p_{H_2}Rp_{H_2}=R^{1/2}VR^{1/2}, \tag{5}$$

where $V$ is bounded, nonnegative and self-adjoint.
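The polar decomposition (Weidmann, p. 197) yields

$$R^{1/2}p_{H_1}=A\,(p_{H_1}Rp_{H_1})^{1/2}, \tag{6}$$

where $A$ is a partial isometry such that:

$AA^*=p_{L_1}$, $L_1$ = closure of range of $R^{1/2}p_{H_1}$,

$A^*A$ is the projection onto the closure of the range of
$(p_{H_1}Rp_{H_1})^{1/2}$.

Similarly, one has

$$U^{1/2}R^{1/2}=B\,(p_{H_1}Rp_{H_1})^{1/2}. \tag{7}$$

(6) yields $A^*R^{1/2}p_{H_1}=(p_{H_1}Rp_{H_1})^{1/2}$ and (7) yields
$B^*U^{1/2}R^{1/2}=(p_{H_1}Rp_{H_1})^{1/2}$.

Consequently, one has from (4) that

$$A^*R^{1/2}p_{H_1}=B^*U^{1/2}R^{1/2}.$$

This can be rewritten as

$$p_{H_1}R^{1/2}=R^{1/2}U^{1/2}BA^*. \tag{8}$$

Similarly, one has

$$p_{H_2}R^{1/2}=R^{1/2}V^{1/2}DC^*. \tag{9}$$

Using (8) and (9) in (3), one gets, if $f=U^{1/2}BA^*TCD^*V^{1/2}$,

$$R^{\otimes}=R^{1/2}(I+f+f^*)R^{1/2}. \tag{10}$$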

Let us show that

$$\sigma(f+f^*)>-1 \tag{11}$$

or, equivalently, that

$$\sqrt{R^{\otimes}}\ \text{and}\ \sqrt{R}\ \text{have the same range.} \tag{12}$$

One has

$$0\le\min(b^2,c^2)\,R^{\otimes}\le R^{\otimes}_{b,c}\le\max(b^2,c^2)\,R^{\otimes},$$

so that $\sqrt{R^{\otimes}}$ and $\sqrt{R^{\otimes}_{b,c}}$ have the same range
(Douglas, 1966). By assumption, $P_a$ and $P^{\otimes}_{b,c}$ are equivalent
and Gaussian so (Rao-Varadarajan, 1963) $\sqrt{R^{\otimes}_{b,c}}$ and
$\sqrt{a^2R}$ have the same range. Then (12) is established.

From (10), (11) and (Rao-Varadarajan, 1963), one has that $P$ and
$P^{\otimes}$ are equivalent, contradicting the assumptions.

Let now $0\le a_1<a_2<\dots<a_n$ be the support of $F$ and suppose $F$ has
mass $p_i\in\,]0,1[$ at $a_i$, $1\le i\le n$. We write $P_i$, $P^{\otimes}_i$,
$P^{\otimes}_{i,j}$ for, respectively, $P_{a_i}$, $P^{\otimes}_{a_i}$,
$P^{\otimes}_{a_i,a_j}$.


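Proposition 1: If $P\ll P^{\otimes}$, then $Q\ll Q^{\otimes}$.

proof: Since $P\ll P^{\otimes}$, $P_i\ll P^{\otimes}_i$, $1\le i\le n$. So, if
$Q^{\otimes}(B)=0$, $P^{\otimes}_i(B)=0$, $1\le i\le n$, using Lemma 5, 5).
But then $P_i(B)=0$, $1\le i\le n$, that is $Q(B)=0$.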


Proposition 2: If $P\perp P^{\otimes}$ and $a_1>0$, then $Q\perp Q^{\otimes}$.
If $a_1=0$, $Q$ and $Q^{\otimes}$ cannot be orthogonal.

proof: If $a_1=0$, $Q(\{0\})>0$ and $Q^{\otimes}(\{0\})>0$. So, if $Q(B)=0$,
$B^c\supseteq\{0\}$ and $Q^{\otimes}(B^c)>0$.

If $a_1>0$, then, by Lemma 8, $P_i\perp P^{\otimes}_{j,k}$,
$1\le i,j,k\le n$. The result then follows by Lemma 6.


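Proposition 3: If $Q\perp Q^{\otimes}$, then $P\perp P^{\otimes}$.

proof: Let $Q(B)=Q^{\otimes}(B^c)=0$. Then $P_i(B)=P^{\otimes}_i(B^c)=0$,
$1\le i\le n$. But, by Lemma 5, 1) and 2),
$P_i(B)=P\{T_{a_i}^{-1}(B)\}$ and
$P^{\otimes}_i(B^c)=P^{\otimes}\{[T_{a_i}^{-1}(B)]^c\}$, so that the set
$T_{a_i}^{-1}(B)$ separates $P$ and $P^{\otimes}$.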


Proposition 4: If $Q\ll Q^{\otimes}$ and $a_1>0$, then $P\ll P^{\otimes}$.

proof: Suppose $P$ is not absolutely continuous with respect to
$P^{\otimes}$. Since $P$ and $P^{\otimes}$ are Gaussian, $P$ and $P^{\otimes}$
are then orthogonal (Rao-Varadarajan, 1963). But, by Proposition 2, one has
that $Q\perp Q^{\otimes}$. So, if $Q(B)=Q^{\otimes}(B^c)=0$, $Q(B^c)=0$ by
assumption and $Q$ is identically zero and cannot be a probability.


Remark: $Q\ll Q^{\otimes}$ does not imply $Q\equiv Q^{\otimes}$, that is,
that $Q$ and $Q^{\otimes}$ are equivalent.

proof: Let $Q=\sum_{i=1}^{n}p_iP_i$. Then
$Q^{\otimes}=\sum_{i=1}^{n}\sum_{j=1}^{n}p_ip_jP^{\otimes}_{i,j}$.

Use Lemma 7 and 6 to find a Borel set $B$ such that

$$P^{\otimes}_i(B)=P^{\otimes}_{j,k}(B^c)=0,\quad 1\le i,j,k\le n,\ j\neq k.$$

Then $Q(B)=0$, but
$Q^{\otimes}(B)=\sum_{i=1}^{n}\sum_{j\neq i}p_ip_j>0$.


B. The Radon-Nikodym derivative of Q with respect to $Q^{\otimes}$

Proposition 5: If $a_1>0$ and $Q\ll Q^{\otimes}$, there exist Borel sets
$B_i$, $1\le i\le n$, such that

$$\frac{dQ}{dQ^{\otimes}}=\sum_{i=1}^{n}\frac{1}{p_i}\,I_{B_i}\,\frac{dP_i}{dP^{\otimes}_i}.$$

proof: For each fixed $i\in\{1,\dots,n\}$ choose, with the help of Lemma 7 and
Lemma 6, a Borel set $B_i$ such that

$$P^{\otimes}_i(B_i^c)=P^{\otimes}_{j,k}(B_i)=0,\quad 1\le j,k\le n,\ (j,k)\neq(i,i).$$

From Proposition 4 and Lemma 5, 1) and 2), we also have that
$P_i\ll P^{\otimes}_i$, $1\le i\le n$.


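Set

$$\Lambda=\sum_{i=1}^{n}\frac{1}{p_i}\,I_{B_i}\,\frac{dP_i}{dP^{\otimes}_i}.$$

Then

$$\int_B\Lambda\,dQ^{\otimes}=\sum_{i=1}^{n}\sum_{j=1}^{n}p_ip_j\int_B\Lambda\,dP^{\otimes}_{i,j}.$$

But

$$\int_B\Lambda\,dP^{\otimes}_{i,j}=\sum_{k=1}^{n}\frac{1}{p_k}\int_{B\cap B_k}\frac{dP_k}{dP^{\otimes}_k}\,dP^{\otimes}_{i,j}$$

and $P^{\otimes}_{i,j}(B_k)=0$ for $(i,j)\neq(k,k)$.

Thus, if $i\neq j$, $\int_B\Lambda\,dP^{\otimes}_{i,j}=0$, and, if $i=j$,

$$\int_B\Lambda\,dP^{\otimes}_{i,i}=\frac{1}{p_i}\int_{B\cap B_i}\frac{dP_i}{dP^{\otimes}_i}\,dP^{\otimes}_i=\frac{1}{p_i}\,P_i(B\cap B_i).$$

Consequently, since $P_i(B_i^c)=0$,

$$\int_B\Lambda\,dQ^{\otimes}=\sum_{i=1}^{n}p_iP_i(B)=Q(B),\quad\text{or}\quad\Lambda=\frac{dQ}{dQ^{\otimes}}.$$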


C.  Calculation  of  mutual  information 

If $P$ is any probability measure on $H$, $I(P)$, the average mutual
information of $P$, is given by the formula

$$I(P)=\int_H\log\frac{dP}{dP^{\otimes}}\,dP,$$

provided $P\ll P^{\otimes}$. The entropy of $F$, $H(F)$, is the quantity
$H(F)=-\sum_{i=1}^{n}p_i\log p_i$.
One has:

Proposition 6: $I(Q)=H(F)+I(P)$.


proof: Let $e_k$ be the eigenvector of $R$ with associated eigenvalue
$\lambda_k$. Let:

$$\psi_m(x)=\frac{1}{m}\sum_{k=1}^{m}\frac{\langle x,e_k\rangle^2}{\lambda_k},\qquad\psi=\limsup_m\psi_m,$$

$$A_i=\{x\in H:\ \psi(x)=a_i^2\},\quad 1\le i\le n.$$

Since $\psi_m$ is measurable, $\psi$ is, so that the $A_i$'s are Borel sets
which are disjoint.

Let $X_k(x)=\langle x,e_k\rangle/\sqrt{\lambda_k}$. With respect to $P_i$,
$\{X_k,\ k\in\mathbb{N}\}$ is a family of independent $N(0,a_i^2)$-random
variables. $\{X_k^2,\ k\in\mathbb{N}\}$ is thus a family of independent random
variables with mean $a_i^2$ and variance $2a_i^4$, still with respect to
$P_i$. By the law of large numbers:

$$P_i(A_i)=1,\quad 1\le i\le n.$$

We can thus assume that $dP_i/dP^{\otimes}_i$ is zero outside of $A_i$.
Consequently,

$$\int_H\log\frac{dQ}{dQ^{\otimes}}\,dP_i=\int_H\log\Bigl[\frac{1}{p_i}\frac{dP_i}{dP^{\otimes}_i}\Bigr]dP_i.$$

But

$$\int_H\log\frac{dP_i}{dP^{\otimes}_i}\,dP_i=\int_H\log\frac{dP}{dP^{\otimes}}\,dP=I(P).$$

Finally,

$$\int_H\log\frac{dQ}{dQ^{\otimes}}\,dQ=\sum_{i=1}^{n}p_i\int_H\log\Bigl[\frac{1}{p_i}\frac{dP_i}{dP^{\otimes}_i}\Bigr]dP_i=\sum_{i=1}^{n}p_i\{-\log p_i+I(P)\}=H(F)+I(P).$$
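The law of large numbers step in this proof can be checked numerically: for independent $N(0,a^2)$ variables, the running average of the squares approaches $a^2$, which is why different scales $a_i$ concentrate on disjoint sets in infinite dimension. The sketch below is only a simulation under that assumption; the function name `estimate_scale_squared`, the sample size and the seed are hypothetical choices.

```python
import random


def estimate_scale_squared(a, m, seed=0):
    """Average of X_k^2 over m independent N(0, a^2) draws.

    By the law of large numbers this tends to a^2 as m grows; the
    fluctuation is of order a^2 * sqrt(2 / m).
    """
    rng = random.Random(seed)
    return sum(rng.gauss(0.0, a) ** 2 for _ in range(m)) / m


# With a = 2 the estimate should be close to a^2 = 4.
est = estimate_scale_squared(a=2.0, m=20000)
```

In a finite-dimensional space the average stops after finitely many terms, so the scales are not separated exactly; the identity of Proposition 6 relies on the infinite-dimensional setting.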


4.  The  case  of  a  general  F 

Lemma 9: Let $B=\{x\in H:\ \|x-h\|<\alpha\}$ and $I=\{a>0:\ ax\in B\}$. $I$ is
an open interval.

proof: Suppose $a_1<a<a_2$ and $a_1,a_2\in I$. Let

$\lambda=(a_2-a)/(a_2-a_1)$ and $1-\lambda=(a-a_1)/(a_2-a_1)$.

Then $a=\lambda a_1+(1-\lambda)a_2$.

So
$\|ax-h\|=\|[\lambda a_1+(1-\lambda)a_2]x-[\lambda+(1-\lambda)]h\|\le\lambda\|a_1x-h\|+(1-\lambda)\|a_2x-h\|<\alpha$.

Thus $a\in I$ and $I$ is an interval. Furthermore, if $a\in I$ and
$\beta=\alpha-\|ax-h\|$, then, for $|b|<\beta/\|x\|$,

$\|(a+b)x-h\|\le\|ax-h\|+|b|\,\|x\|<\|ax-h\|+\beta=\alpha$. $I$ is thus open.
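In finite dimension the interval of Lemma 9 can be computed explicitly, since $\|ax-h\|^2<\alpha^2$ is a quadratic inequality in $a$. The sketch below solves it for vectors in $\mathbb{R}^d$; it assumes $x\neq 0$, and the name `scale_interval` is hypothetical.

```python
import math


def scale_interval(x, h, alpha):
    """Return (lo, hi) with {a > 0 : ||a x - h|| < alpha} = ]lo, hi[,
    or None when the set is empty. Assumes x is a nonzero vector."""
    xx = sum(t * t for t in x)               # ||x||^2
    xh = sum(s * t for s, t in zip(x, h))    # <x, h>
    hh = sum(t * t for t in h)               # ||h||^2
    # ||a x - h||^2 < alpha^2  <=>  xx a^2 - 2 xh a + (hh - alpha^2) < 0
    disc = xh * xh - xx * (hh - alpha * alpha)
    if disc <= 0.0:
        return None
    root = math.sqrt(disc)
    lo, hi = (xh - root) / xx, (xh + root) / xx
    if hi <= 0.0:
        return None
    return (max(lo, 0.0), hi)


# Ball of radius 1 around h = (3, 0), ray through x = (1, 0): |a - 3| < 1.
interval = scale_interval([1.0, 0.0], [3.0, 0.0], 1.0)
```

The solution set of a strict quadratic inequality with positive leading coefficient is an open interval, which matches the convexity argument of the proof.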

Lemma 10: Suppose $\{F_n,\ n\in\mathbb{N}\}$ converges weakly to $F$. Let, for
$B\in\mathcal{B}[H]$,

$$Q_n(B)=\int_0^{\infty}P_a(B)\,F_n(da).$$

Then $\{Q_n,\ n\in\mathbb{N}\}$ converges weakly to $Q$.

proof: Let $g(x)=\int_0^{\infty}I_B(ax)\,F(da)$. $g_n$ is defined similarly,
$F_n$ replacing $F$. Then:

$$Q(B)=\int_H P(dx)\,g(x)$$

and

$$Q_n(B)=\int_H P(dx)\,g_n(x).$$

Let $B$ be open: it is a union of open balls, so that $I_B(ax)$, as a function
of $a$, is the indicator of an open set of the real line (Lemma 9).
Consequently, by weak convergence, $g(x)\le\liminf_n g_n(x)$; thus, by Fatou's
lemma, $Q(B)\le\liminf_n Q_n(B)$.

Proposition 7: Let $\{F_n,\ n\in\mathbb{N}\}$ be a sequence of discrete
probabilities with no mass at the origin which converges weakly to $F$. Then

$$I(Q)\le\liminf_n H(F_n)+I(P).$$

proof: $I$ is lower semicontinuous for weak convergence.

Remark: Taking $F$ to be the uniform distribution, one sees that the bound
of Proposition 7 will not in general be useful. This seems to indicate that
there is little hope of studying the general case of $F$ starting with finite
dimensional approximations. This is due to the form taken by $I(Q)$ in
Proposition 6: indeed, in general, $H(F_n)$ does not tend to $H(F)$.
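A small computation makes the remark concrete. Assuming, purely for illustration, that $F_n$ puts mass $1/n$ on $n$ distinct points (one natural discretization of a uniform distribution), the entropy $H(F_n)$ equals $\log n$ and grows without bound, so the right-hand side of Proposition 7 diverges. The helper names `entropy` and `mixture_information` are hypothetical, and $I(P)$ is taken as a given number rather than computed.

```python
import math


def entropy(probs):
    """H(F) = -sum p_i log p_i for a discrete probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mixture_information(probs, i_p):
    """Value H(F) + I(P) from Proposition 6, with I(P) supplied by the caller."""
    return entropy(probs) + i_p


# Entropy of the n-point uniform discretization for increasing n.
for n in (10, 100, 1000):
    uniform_n = [1.0 / n] * n
    print(n, entropy(uniform_n), math.log(n))
```

Refining the discretization therefore only makes the bound weaker, which is the obstruction described in the remark.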